Overview

Dataset statistics

Number of variables: 10
Number of observations: 226537
Missing cells: 455568
Missing cells (%): 20.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.3 MiB
Average record size in memory: 80.0 B
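The percentage and memory figures follow from the raw counts. A minimal sketch of the arithmetic, assuming every value is stored in 8 bytes (an assumption that is consistent with the reported 80.0 B record size for 10 numeric variables, but is not stated by the report):

```python
# Figures copied from the dataset statistics.
n_variables = 10
n_observations = 226537
missing_cells = 455568

# Share of missing cells across the whole table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each value occupies 8 bytes (for example a 64-bit float).
BYTES_PER_VALUE = 8
record_size_bytes = n_variables * BYTES_PER_VALUE
total_size_mib = n_observations * record_size_bytes / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {record_size_bytes} B")
print(f"Total size in memory: {total_size_mib:.1f} MiB")
```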

Variable types

Numeric: 10

Alerts

wime_komfort is highly correlated with df_index and 8 other fields (High correlation)
wime_sauberkeit is highly correlated with wime_personal and 6 other fields (High correlation)
wime_platzangebot is highly correlated with wime_personal and 6 other fields (High correlation)
wime_gesamtzuf is highly correlated with wime_personal and 7 other fields (High correlation)
wime_preis_leistung is highly correlated with df_index and 6 other fields (High correlation)
wime_personal is highly correlated with wime_komfort and 7 other fields (High correlation)
wime_puenktlich is highly correlated with wime_personal and 4 other fields (High correlation)
wime_fahrplan is highly correlated with wime_personal and 6 other fields (High correlation)
df_index is highly correlated with wime_komfort and 1 other field (High correlation)
wime_oes_fahrt is highly correlated with wime_personal and 3 other fields (High correlation)
wime_personal has 149074 (65.8%) missing values (Missing)
wime_komfort has 50395 (22.2%) missing values (Missing)
wime_sauberkeit has 47232 (20.8%) missing values (Missing)
wime_puenktlich has 46621 (20.6%) missing values (Missing)
wime_platzangebot has 45836 (20.2%) missing values (Missing)
wime_gesamtzuf has 38474 (17.0%) missing values (Missing)
wime_preis_leistung has 15301 (6.8%) missing values (Missing)
wime_fahrplan has 8190 (3.6%) missing values (Missing)
wime_oes_fahrt has 54445 (24.0%) missing values (Missing)
df_index has unique values (Unique)
wime_komfort has 2918 (1.3%) zeros (Zeros)
wime_puenktlich has 4598 (2.0%) zeros (Zeros)
wime_platzangebot has 6262 (2.8%) zeros (Zeros)
wime_preis_leistung has 6364 (2.8%) zeros (Zeros)
wime_fahrplan has 5237 (2.3%) zeros (Zeros)

Reproduction

Analysis started: 2022-11-20 08:13:04.122535
Analysis finished: 2022-11-20 08:13:35.900510
Duration: 31.78 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 226537
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113441.2574
Minimum: 0
Maximum: 229488
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 11499.8
Q1: 56807
Median: 113441
Q3: 170075
95-th percentile: 215382.2
Maximum: 229488
Range: 229488
Interquartile range (IQR): 113268

Descriptive statistics

Standard deviation: 65397.92223
Coefficient of variation (CV): 0.576491514
Kurtosis: -1.199839849
Mean: 113441.2574
Median Absolute Deviation (MAD): 56634
Skewness: 2.43676751 × 10^-5
Sum: 2.569864213 × 10^10
Variance: 4276888232
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
226335 | 1 | < 0.1%
75813 | 1 | < 0.1%
75844 | 1 | < 0.1%
75846 | 1 | < 0.1%
75827 | 1 | < 0.1%
75826 | 1 | < 0.1%
75825 | 1 | < 0.1%
75824 | 1 | < 0.1%
75808 | 1 | < 0.1%
75809 | 1 | < 0.1%
Other values (226527) | 226527 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
229488 | 1 | < 0.1%
229345 | 1 | < 0.1%
229287 | 1 | < 0.1%
228803 | 1 | < 0.1%
228738 | 1 | < 0.1%
228717 | 1 | < 0.1%
228687 | 1 | < 0.1%
228498 | 1 | < 0.1%
228420 | 1 | < 0.1%
228400 | 1 | < 0.1%
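
Several of the df_index statistics are tied together by simple identities, which makes for a quick consistency check. The sketch below only re-derives figures from values already reported for df_index; the variable names are illustrative.

```python
import math

# Values copied from the df_index statistics.
mean = 113441.2574
std = 65397.92223
q1, q3 = 56807, 170075
minimum, maximum = 0, 229488

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range

print(f"CV={cv:.9f}, IQR={iqr}, variance={variance:.0f}, range={value_range}")
```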

wime_personal
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 149074
Missing (%): 65.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 89.87376769
Minimum: 0
Maximum: 100
Zeros: 688
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 50
Q1: 77.77777778
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 22.22222222

Descriptive statistics

Standard deviation: 17.83286506
Coefficient of variation (CV): 0.198421247
Kurtosis: 6.721953352
Mean: 89.87376769
Median Absolute Deviation (MAD): 0
Skewness: -2.361162011
Sum: 6961891.667
Variance: 318.0110763
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 50819 | 22.4%
75 | 9400 | 4.1%
88.88888889 | 5367 | 2.4%
77.77777778 | 4839 | 2.1%
66.66666667 | 1877 | 0.8%
50 | 1711 | 0.8%
44.44444444 | 885 | 0.4%
55.55555556 | 817 | 0.4%
0 | 688 | 0.3%
25 | 431 | 0.2%
Other values (3) | 629 | 0.3%
(Missing) | 149074 | 65.8%

Minimum 10 values
Value | Count | Frequency (%)
0 | 688 | 0.3%
11.11111111 | 138 | 0.1%
22.22222222 | 227 | 0.1%
25 | 431 | 0.2%
33.33333333 | 264 | 0.1%
44.44444444 | 885 | 0.4%
50 | 1711 | 0.8%
55.55555556 | 817 | 0.4%
66.66666667 | 1877 | 0.8%
75 | 9400 | 4.1%

Maximum 10 values
Value | Count | Frequency (%)
100 | 50819 | 22.4%
88.88888889 | 5367 | 2.4%
77.77777778 | 4839 | 2.1%
75 | 9400 | 4.1%
66.66666667 | 1877 | 0.8%
55.55555556 | 817 | 0.4%
50 | 1711 | 0.8%
44.44444444 | 885 | 0.4%
33.33333333 | 264 | 0.1%
25 | 431 | 0.2%
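
The mean, median, and MAD reported for wime_personal can be recomputed from the value counts alone. The sketch below assumes that the rounded values such as 11.11111111 are exact multiples of 100/9; that is an interpretation of the printed figures, not something the report states. It shows why the MAD is 0: more than half of the non-missing answers equal the median of 100.

```python
from statistics import median

# Count per distinct value of wime_personal, read from the frequency tables.
# Assumption: values like 11.11111111 are exactly k * 100 / 9.
counts = {
    0: 688,
    100 / 9: 138,
    200 / 9: 227,
    25: 431,
    300 / 9: 264,
    400 / 9: 885,
    50: 1711,
    500 / 9: 817,
    600 / 9: 1877,
    75: 9400,
    700 / 9: 4839,
    800 / 9: 5367,
    100: 50819,
}

n = sum(counts.values())  # number of non-missing observations
mean = sum(value * count for value, count in counts.items()) / n

# Expand the table into individual observations to take medians.
values = [value for value, count in counts.items() for _ in range(count)]
med = median(values)
mad = median([abs(value - med) for value in values])

print(n, round(mean, 8), med, mad)
```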

wime_komfort
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 50395
Missing (%): 22.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.91358803
Minimum: 0
Maximum: 100
Zeros: 2918
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 33.33333333
Q1: 75
Median: 77.77777778
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 22.84215857
Coefficient of variation (CV): 0.2894578632
Kurtosis: 1.473248838
Mean: 78.91358803
Median Absolute Deviation (MAD): 22.22222222
Skewness: -1.241285089
Sum: 13899997.22
Variance: 521.7642082
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 67219 | 29.7%
75 | 36387 | 16.1%
77.77777778 | 17438 | 7.7%
88.88888889 | 12623 | 5.6%
66.66666667 | 10817 | 4.8%
50 | 10106 | 4.5%
55.55555556 | 5996 | 2.6%
44.44444444 | 4845 | 2.1%
0 | 2918 | 1.3%
25 | 2750 | 1.2%
Other values (3) | 5043 | 2.2%
(Missing) | 50395 | 22.2%

Minimum 10 values
Value | Count | Frequency (%)
0 | 2918 | 1.3%
11.11111111 | 1002 | 0.4%
22.22222222 | 1685 | 0.7%
25 | 2750 | 1.2%
33.33333333 | 2356 | 1.0%
44.44444444 | 4845 | 2.1%
50 | 10106 | 4.5%
55.55555556 | 5996 | 2.6%
66.66666667 | 10817 | 4.8%
75 | 36387 | 16.1%

Maximum 10 values
Value | Count | Frequency (%)
100 | 67219 | 29.7%
88.88888889 | 12623 | 5.6%
77.77777778 | 17438 | 7.7%
75 | 36387 | 16.1%
66.66666667 | 10817 | 4.8%
55.55555556 | 5996 | 2.6%
50 | 10106 | 4.5%
44.44444444 | 4845 | 2.1%
33.33333333 | 2356 | 1.0%
25 | 2750 | 1.2%

wime_sauberkeit
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 47232
Missing (%): 20.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.30538902
Minimum: 0
Maximum: 100
Zeros: 1583
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 44.44444444
Q1: 75
Median: 77.77777778
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 21.44728534
Coefficient of variation (CV): 0.2704391922
Kurtosis: 1.142929158
Mean: 79.30538902
Median Absolute Deviation (MAD): 22.22222222
Skewness: -1.102926376
Sum: 14219852.78
Variance: 459.9860485
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 66268 | 29.3%
75 | 39181 | 17.3%
77.77777778 | 18011 | 8.0%
88.88888889 | 13642 | 6.0%
50 | 12495 | 5.5%
66.66666667 | 10775 | 4.8%
55.55555556 | 5581 | 2.5%
44.44444444 | 4508 | 2.0%
25 | 3170 | 1.4%
33.33333333 | 2167 | 1.0%
Other values (3) | 3507 | 1.5%
(Missing) | 47232 | 20.8%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1583 | 0.7%
11.11111111 | 606 | 0.3%
22.22222222 | 1318 | 0.6%
25 | 3170 | 1.4%
33.33333333 | 2167 | 1.0%
44.44444444 | 4508 | 2.0%
50 | 12495 | 5.5%
55.55555556 | 5581 | 2.5%
66.66666667 | 10775 | 4.8%
75 | 39181 | 17.3%

Maximum 10 values
Value | Count | Frequency (%)
100 | 66268 | 29.3%
88.88888889 | 13642 | 6.0%
77.77777778 | 18011 | 8.0%
75 | 39181 | 17.3%
66.66666667 | 10775 | 4.8%
55.55555556 | 5581 | 2.5%
50 | 12495 | 5.5%
44.44444444 | 4508 | 2.0%
33.33333333 | 2167 | 1.0%
25 | 3170 | 1.4%

wime_puenktlich
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 46621
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 88.91320579
Minimum: 0
Maximum: 100
Zeros: 4598
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 33.33333333
Q1: 88.88888889
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 11.11111111

Descriptive statistics

Standard deviation: 22.12768183
Coefficient of variation (CV): 0.2488683389
Kurtosis: 6.159402156
Mean: 88.91320579
Median Absolute Deviation (MAD): 0
Skewness: -2.509552242
Sum: 15996908.33
Variance: 489.6343031
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 124288 | 54.9%
75 | 17208 | 7.6%
88.88888889 | 11702 | 5.2%
77.77777778 | 7321 | 3.2%
0 | 4598 | 2.0%
50 | 4093 | 1.8%
66.66666667 | 2952 | 1.3%
25 | 2093 | 0.9%
44.44444444 | 1566 | 0.7%
55.55555556 | 1485 | 0.7%
Other values (3) | 2610 | 1.2%
(Missing) | 46621 | 20.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 4598 | 2.0%
11.11111111 | 629 | 0.3%
22.22222222 | 988 | 0.4%
25 | 2093 | 0.9%
33.33333333 | 993 | 0.4%
44.44444444 | 1566 | 0.7%
50 | 4093 | 1.8%
55.55555556 | 1485 | 0.7%
66.66666667 | 2952 | 1.3%
75 | 17208 | 7.6%

Maximum 10 values
Value | Count | Frequency (%)
100 | 124288 | 54.9%
88.88888889 | 11702 | 5.2%
77.77777778 | 7321 | 3.2%
75 | 17208 | 7.6%
66.66666667 | 2952 | 1.3%
55.55555556 | 1485 | 0.7%
50 | 4093 | 1.8%
44.44444444 | 1566 | 0.7%
33.33333333 | 993 | 0.4%
25 | 2093 | 0.9%

wime_platzangebot
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 45836
Missing (%): 20.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.26031646
Minimum: 0
Maximum: 100
Zeros: 6262
Zeros (%): 2.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 22.22222222
Q1: 75
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 26.7383851
Coefficient of variation (CV): 0.3331457722
Kurtosis: 1.385339091
Mean: 80.26031646
Median Absolute Deviation (MAD): 0
Skewness: -1.454391955
Sum: 14503119.44
Variance: 714.941238
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 91346 | 40.3%
75 | 25485 | 11.2%
77.77777778 | 12081 | 5.3%
88.88888889 | 10341 | 4.6%
50 | 9523 | 4.2%
66.66666667 | 6876 | 3.0%
0 | 6262 | 2.8%
25 | 4450 | 2.0%
55.55555556 | 4096 | 1.8%
44.44444444 | 3937 | 1.7%
Other values (3) | 6304 | 2.8%
(Missing) | 45836 | 20.2%

Minimum 10 values
Value | Count | Frequency (%)
0 | 6262 | 2.8%
11.11111111 | 1549 | 0.7%
22.22222222 | 2316 | 1.0%
25 | 4450 | 2.0%
33.33333333 | 2439 | 1.1%
44.44444444 | 3937 | 1.7%
50 | 9523 | 4.2%
55.55555556 | 4096 | 1.8%
66.66666667 | 6876 | 3.0%
75 | 25485 | 11.2%

Maximum 10 values
Value | Count | Frequency (%)
100 | 91346 | 40.3%
88.88888889 | 10341 | 4.6%
77.77777778 | 12081 | 5.3%
75 | 25485 | 11.2%
66.66666667 | 6876 | 3.0%
55.55555556 | 4096 | 1.8%
50 | 9523 | 4.2%
44.44444444 | 3937 | 1.7%
33.33333333 | 2439 | 1.1%
25 | 4450 | 2.0%

wime_gesamtzuf
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 38474
Missing (%): 17.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84.58338725
Minimum: 0
Maximum: 100
Zeros: 2040
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 50
Q1: 75
Median: 88.88888889
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 19.58225893
Coefficient of variation (CV): 0.2315142438
Kurtosis: 3.750400158
Mean: 84.58338725
Median Absolute Deviation (MAD): 11.11111111
Skewness: -1.725912029
Sum: 15907005.56
Variance: 383.4648649
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 87829 | 38.8%
75 | 37915 | 16.7%
88.88888889 | 20359 | 9.0%
77.77777778 | 16801 | 7.4%
50 | 7048 | 3.1%
66.66666667 | 6768 | 3.0%
55.55555556 | 2801 | 1.2%
44.44444444 | 2196 | 1.0%
0 | 2040 | 0.9%
25 | 1869 | 0.8%
Other values (3) | 2437 | 1.1%
(Missing) | 38474 | 17.0%

Minimum 10 values
Value | Count | Frequency (%)
0 | 2040 | 0.9%
11.11111111 | 475 | 0.2%
22.22222222 | 915 | 0.4%
25 | 1869 | 0.8%
33.33333333 | 1047 | 0.5%
44.44444444 | 2196 | 1.0%
50 | 7048 | 3.1%
55.55555556 | 2801 | 1.2%
66.66666667 | 6768 | 3.0%
75 | 37915 | 16.7%

Maximum 10 values
Value | Count | Frequency (%)
100 | 87829 | 38.8%
88.88888889 | 20359 | 9.0%
77.77777778 | 16801 | 7.4%
75 | 37915 | 16.7%
66.66666667 | 6768 | 3.0%
55.55555556 | 2801 | 1.2%
50 | 7048 | 3.1%
44.44444444 | 2196 | 1.0%
33.33333333 | 1047 | 0.5%
25 | 1869 | 0.8%

wime_preis_leistung
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 15301
Missing (%): 6.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.89085352
Minimum: 0
Maximum: 100
Zeros: 6364
Zeros (%): 2.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 25
Q1: 50
Median: 75
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 26.43170844
Coefficient of variation (CV): 0.3577128586
Kurtosis: 0.2255697471
Mean: 73.89085352
Median Absolute Deviation (MAD): 25
Skewness: -0.9196123172
Sum: 15608408.33
Variance: 698.6352108
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 75551 | 33.4%
75 | 45071 | 19.9%
50 | 25270 | 11.2%
77.77777778 | 13293 | 5.9%
66.66666667 | 10076 | 4.4%
25 | 8402 | 3.7%
88.88888889 | 7804 | 3.4%
0 | 6364 | 2.8%
55.55555556 | 6284 | 2.8%
44.44444444 | 6253 | 2.8%
Other values (3) | 6868 | 3.0%
(Missing) | 15301 | 6.8%

Minimum 10 values
Value | Count | Frequency (%)
0 | 6364 | 2.8%
11.11111111 | 1219 | 0.5%
22.22222222 | 2588 | 1.1%
25 | 8402 | 3.7%
33.33333333 | 3061 | 1.4%
44.44444444 | 6253 | 2.8%
50 | 25270 | 11.2%
55.55555556 | 6284 | 2.8%
66.66666667 | 10076 | 4.4%
75 | 45071 | 19.9%

Maximum 10 values
Value | Count | Frequency (%)
100 | 75551 | 33.4%
88.88888889 | 7804 | 3.4%
77.77777778 | 13293 | 5.9%
75 | 45071 | 19.9%
66.66666667 | 10076 | 4.4%
55.55555556 | 6284 | 2.8%
50 | 25270 | 11.2%
44.44444444 | 6253 | 2.8%
33.33333333 | 3061 | 1.4%
25 | 8402 | 3.7%

wime_fahrplan
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 8190
Missing (%): 3.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.35784834
Minimum: 0
Maximum: 100
Zeros: 5237
Zeros (%): 2.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 25
Q1: 75
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 23.89661238
Coefficient of variation (CV): 0.2866750145
Kurtosis: 2.525530506
Mean: 83.35784834
Median Absolute Deviation (MAD): 0
Skewness: -1.682809537
Sum: 18200936.11
Variance: 571.0480832
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 118187 | 52.2%
75 | 32842 | 14.5%
77.77777778 | 14678 | 6.5%
88.88888889 | 12666 | 5.6%
50 | 10663 | 4.7%
66.66666667 | 7748 | 3.4%
0 | 5237 | 2.3%
25 | 4365 | 1.9%
55.55555556 | 3956 | 1.7%
44.44444444 | 3745 | 1.7%
Other values (3) | 4260 | 1.9%
(Missing) | 8190 | 3.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 5237 | 2.3%
11.11111111 | 840 | 0.4%
22.22222222 | 1509 | 0.7%
25 | 4365 | 1.9%
33.33333333 | 1911 | 0.8%
44.44444444 | 3745 | 1.7%
50 | 10663 | 4.7%
55.55555556 | 3956 | 1.7%
66.66666667 | 7748 | 3.4%
75 | 32842 | 14.5%

Maximum 10 values
Value | Count | Frequency (%)
100 | 118187 | 52.2%
88.88888889 | 12666 | 5.6%
77.77777778 | 14678 | 6.5%
75 | 32842 | 14.5%
66.66666667 | 7748 | 3.4%
55.55555556 | 3956 | 1.7%
50 | 10663 | 4.7%
44.44444444 | 3745 | 1.7%
33.33333333 | 1911 | 0.8%
25 | 4365 | 1.9%

wime_oes_fahrt
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 54445
Missing (%): 24.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90.68931476
Minimum: 0
Maximum: 100
Zeros: 405
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 66.66666667
Q1: 77.77777778
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 22.22222222

Descriptive statistics

Standard deviation: 14.74616729
Coefficient of variation (CV): 0.162600934
Kurtosis: 5.606273125
Mean: 90.68931476
Median Absolute Deviation (MAD): 0
Skewness: -2.017445543
Sum: 15606905.56
Variance: 217.4494496
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
100 | 107734 | 47.6%
75 | 23036 | 10.2%
88.88888889 | 16983 | 7.5%
77.77777778 | 12813 | 5.7%
66.66666667 | 4225 | 1.9%
50 | 3252 | 1.4%
55.55555556 | 1485 | 0.7%
44.44444444 | 943 | 0.4%
25 | 586 | 0.3%
0 | 405 | 0.2%
Other values (3) | 630 | 0.3%
(Missing) | 54445 | 24.0%

Minimum 10 values
Value | Count | Frequency (%)
0 | 405 | 0.2%
11.11111111 | 106 | < 0.1%
22.22222222 | 210 | 0.1%
25 | 586 | 0.3%
33.33333333 | 314 | 0.1%
44.44444444 | 943 | 0.4%
50 | 3252 | 1.4%
55.55555556 | 1485 | 0.7%
66.66666667 | 4225 | 1.9%
75 | 23036 | 10.2%

Maximum 10 values
Value | Count | Frequency (%)
100 | 107734 | 47.6%
88.88888889 | 16983 | 7.5%
77.77777778 | 12813 | 5.7%
75 | 23036 | 10.2%
66.66666667 | 4225 | 1.9%
55.55555556 | 1485 | 0.7%
50 | 3252 | 1.4%
44.44444444 | 943 | 0.4%
33.33333333 | 314 | 0.1%
25 | 586 | 0.3%

Interactions

[Interaction plots: 100 SVG images, not recoverable from the extracted text.]

Correlations

Auto

The auto setting is an easily interpretable pairwise column metric that picks a method from the types of the two columns: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (with the numerical column discretized), and numerical-numerical uses Spearman's ρ. This configuration applies the most suitable method to each pair of columns.
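
The type-to-method mapping can be pictured as a small lookup table. The sketch below is illustrative only: `AUTO_METHODS` and `auto_method` are hypothetical names, not part of the profiling library. Since all 10 variables in this report are numeric, every pair would fall under Spearman's ρ.

```python
# Hypothetical names; this mirrors the mapping described in the text,
# not the library's actual implementation.
AUTO_METHODS = {
    ("categorical", "categorical"): "Cramer's V",
    ("categorical", "numerical"): "Cramer's V (numerical column discretized)",
    ("numerical", "numerical"): "Spearman's rho",
}


def auto_method(type_a: str, type_b: str) -> str:
    """Return the correlation method for a pair of column types."""
    # Sort the pair so that the order of the two columns does not matter.
    key = tuple(sorted((type_a, type_b)))
    return AUTO_METHODS[key]


print(auto_method("numerical", "numerical"))
```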

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
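
A minimal pure-Python sketch of that calculation, assuming tied values receive the average of their rank positions (a common convention; the report does not say how ties are handled). The function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship yields rho close to 1.
print(spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```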

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
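
A minimal pure-Python sketch of that formula, together with a check of the location and scale invariance. The data points are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (sx * sy)


# Illustrative data, not taken from the report.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 9.0, 11.0]

r = pearson(x, y)
# Shifting and rescaling each variable by a positive factor leaves r unchanged.
r_transformed = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r, r_transformed)
```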

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
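
A minimal sketch of the pair-counting definition as worded here, which corresponds to the variant usually called tau-a. Statistical libraries often default to a tie-corrected variant (tau-b), so values for heavily tied data such as these ratings may differ from this sketch. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```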

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
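
Nullity correlation can be sketched as the Pearson correlation between per-column missingness indicators. This is an assumption about how the heatmap is computed, and the function name and toy columns below are illustrative, not taken from the dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sqrt(sum((p - ma) ** 2 for p in a))
    sb = sqrt(sum((q - mb) ** 2 for q in b))
    if sa == 0 or sb == 0:
        # A column that is never (or always) missing has no variance.
        return float("nan")
    return cov / (sa * sb)


# Toy columns: None marks a missing answer.
always_together = nullity_correlation([None, 75, None, 100], [None, 50, None, 25])
never_together = nullity_correlation([None, 75, None, 100], [50, None, 25, None])
print(always_together, never_together)
```

A value near +1 means the two columns tend to be missing in the same rows, and a value near -1 means one is present when the other is missing.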

Sample

First rows

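(index) | df_index | wime_personal | wime_komfort | wime_sauberkeit | wime_puenktlich | wime_platzangebot | wime_gesamtzuf | wime_preis_leistung | wime_fahrplan | wime_oes_fahrt
0 | 226335 | NaN | 25.0 | 75.0 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0 | 75.0
1 | 226400 | 75.0 | 75.0 | 50.0 | 100.0 | 50.0 | 75.0 | 50.0 | 75.0 | 100.0
2 | 226141 | NaN | NaN | NaN | NaN | NaN | NaN | 100.0 | 100.0 | NaN
3 | 226748 | NaN | 25.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 100.0
4 | 226415 | NaN | 100.0 | 50.0 | 100.0 | 75.0 | 100.0 | 50.0 | 50.0 | 50.0
5 | 226408 | 75.0 | 75.0 | 75.0 | 100.0 | 75.0 | 75.0 | 25.0 | 50.0 | 100.0
6 | 226754 | 100.0 | 75.0 | 75.0 | 100.0 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0
7 | 226404 | NaN | 75.0 | 75.0 | 50.0 | 100.0 | 100.0 | 75.0 | 75.0 | 75.0
8 | 226403 | 100.0 | 100.0 | 75.0 | NaN | 100.0 | 75.0 | 100.0 | 75.0 | NaN
9 | 226760 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0 | 75.0 | 75.0 | 75.0 | 100.0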

Last rows

(index) | df_index | wime_personal | wime_komfort | wime_sauberkeit | wime_puenktlich | wime_platzangebot | wime_gesamtzuf | wime_preis_leistung | wime_fahrplan | wime_oes_fahrt
226527 | 880 | 100.0 | 77.777778 | 55.555556 | 100.000000 | 100.000000 | 77.777778 | 77.777778 | 100.000000 | 100.000000
226528 | 881 | NaN | 100.000000 | 77.777778 | 100.000000 | 77.777778 | 88.888889 | 33.333333 | 44.444444 | 100.000000
226529 | 882 | NaN | 77.777778 | 66.666667 | 100.000000 | 66.666667 | 77.777778 | 88.888889 | 77.777778 | 77.777778
226530 | 883 | 100.0 | 88.888889 | 77.777778 | 100.000000 | 88.888889 | 88.888889 | 88.888889 | 88.888889 | 88.888889
226531 | 884 | NaN | 66.666667 | 100.000000 | 100.000000 | 100.000000 | 88.888889 | 66.666667 | 100.000000 | 100.000000
226532 | 885 | NaN | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000
226533 | 886 | 100.0 | 22.222222 | 55.555556 | 11.111111 | 0.000000 | 33.333333 | 0.000000 | 22.222222 | 44.444444
226534 | 887 | NaN | 77.777778 | 55.555556 | 100.000000 | 100.000000 | 77.777778 | 0.000000 | 100.000000 | 100.000000
226535 | 888 | NaN | 66.666667 | 77.777778 | 100.000000 | 66.666667 | 77.777778 | 55.555556 | 55.555556 | 66.666667
226536 | 913 | NaN | 100.000000 | 100.000000 | 88.888889 | 100.000000 | 88.888889 | 100.000000 | 100.000000 | 100.000000